[57] ABSTRACT

A method and apparatus for linear vocal control of cursor position
within a computer display system. A microphone is utilized in
conjunction with a computer system to detect vocal utterances and each
vocal utterance is then coupled to an analysis circuit to detect voiced
and unvoiced vocal utterances. Variations in the pitch of each voiced
vocal utterance and the virtual frequency of each unvoiced vocal
utterance are then utilized to linearly vary the position of a cursor in
the computer display system in two axes independently. In a depicted
embodiment of the present invention the analysis circuit includes a
short delay to ensure that a valid control signal has occurred.
Thereafter, increases or decreases in pitch or virtual frequency from an
initial value are utilized to initiate movement by the cursor in a
positive or negative direction in the two axes. Cursor motion will
persist until pitch or virtual frequency return to an initial value or
until the utterance ceases. In one embodiment of the present invention
the appearance of the cursor is graphically altered to indicate the
presence of a valid control signal.
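
The control behaviour summarised above can be sketched in a few lines of Python. This is only an illustrative sketch and not the circuit of the depicted embodiment: the function name `axis_velocities`, the `gain` parameter and the use of `None` to mean "no utterance detected" are assumptions made for the example, and the short validation delay is modelled as a count of consecutive frames.

```python
def axis_velocities(readings, qualify_frames=3, gain=1.0):
    """Map a stream of pitch (or virtual frequency) readings to cursor velocities.

    readings: iterable of floats, or None when no utterance is detected.
    qualify_frames: hypothetical stand-in for the short validation delay.
    gain: hypothetical scale factor from frequency change to velocity.
    """
    velocities = []
    valid_run = 0    # consecutive frames that carried a reading
    initial = None   # reference value latched once the signal is valid
    for value in readings:
        if value is None:
            # Utterance ceased: stop the cursor and forget the reference.
            valid_run = 0
            initial = None
            velocities.append(0.0)
            continue
        valid_run += 1
        if initial is None:
            if valid_run >= qualify_frames:
                initial = value  # control signal has just become valid
            velocities.append(0.0)
            continue
        # A rise above the initial value moves the cursor one way, a fall
        # below it moves the cursor the other way, and a return to the
        # initial value gives zero velocity.
        velocities.append(gain * (value - initial))
    return velocities
```

For example, with three qualifying frames at 200 Hz and a gain of 0.5, a rise to 220 Hz yields a velocity of 10.0, a return to 200 Hz stops the cursor, and a fall to 180 Hz yields -10.0. One such function would run per axis, fed by pitch for one axis and by virtual frequency for the other.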



12 Claims, 2 Drawing Sheets 
METHOD AND APPARATUS FOR LINEAR VOCAL CONTROL OF CURSOR POSITION

BACKGROUND OF THE INVENTION

1. Technical Field

The present invention relates in general to improved data processing
systems and in particular to a method and apparatus for cursor control
within a data processing system. Still more particularly, the present
invention relates to a method and apparatus for the linear vocal control
of cursor position in a data processing system.

2. Description of the Related Art

Voice control of various mechanical and/or electrical devices is well
known in the art. In hand occupied environments or among the physically
challenged, the accurate control of such devices is a much desired
technology.

Known control devices for electrical appliances range from simple power
relays which apply or remove power from an appliance in response to the
sound of a whistle or the clapping of hands, to sophisticated computer
control devices which permit complex commands to be entered verbally.
For example, telephone systems exist which automatically dial an
outgoing telephone call in response to a verbal command identifying a
desired individual.

Modern computer systems often utilize a so-called "Graphic User
Interface" or "GUI" to permit computer users to access computer
applications and manipulate computer files in an intrinsically natural
manner by graphically designating graphic representations, or icons, and
manipulating those icons in a manner well known in the art. Such graphic
designation typically takes place utilizing a graphic pointing device,
such as a mouse or light pen, to relocate a "cursor" which selects the
desired computer application.

Vocal control of a cursor has been attempted in certain state-of-the-art
computer systems by recognizing certain command speech utterances such
as "UP," "DOWN," "LEFT," "RIGHT," and "STOP." This approach has proven
cumbersome for fine cursor control due to the number of iterations which
are typically necessary to position a cursor at a desired location.
Other systems permit gross positioning of a cursor by mapping the cursor
to various regions of a display screen in response to selected vowel
sounds; however, as above, this system is cumbersome for fine cursor
positioning.

Therefore, it should be apparent that a need exists for a system which
permits the accurate control of cursor position in a computer system by
verbalized commands.

SUMMARY OF THE INVENTION

It is therefore one object of the present invention to provide an
improved computer system.

It is another object of the present invention to provide an improved
method and apparatus for cursor control within a computer system.

It is yet another object of the present invention to provide an improved
method and apparatus for linear vocal cursor control within a computer
system.

The foregoing objects are achieved as is now described. A microphone is
utilized in conjunction with a computer system to detect vocal
utterances and each vocal utterance is then coupled to an analysis
circuit to detect voiced and unvoiced vocal utterances. Variations in
the pitch of each voiced vocal utterance and the virtual frequency of
each unvoiced vocal utterance are then utilized to linearly vary the
position of a cursor in the computer display system in two axes
independently. In a depicted embodiment of the present invention the
analysis circuit includes a short delay to ensure that a valid control
signal has occurred. Thereafter, increases or decreases in pitch or
virtual frequency from an initial value are utilized to initiate
movement by the cursor in a positive or negative direction in the two
axes. Cursor motion will persist until pitch or virtual frequency return
to an initial value or until the utterance ceases. In one embodiment of
the present invention the appearance of the cursor is graphically
altered to indicate the presence of a valid control signal.

BRIEF DESCRIPTION OF THE DRAWING

The novel features believed characteristic of the invention are set
forth in the appended claims. The invention itself, however, as well as
a preferred mode of use, further objects and advantages thereof, will
best be understood by reference to the following detailed description of
an illustrative embodiment when read in conjunction with the
accompanying drawings, wherein:

FIG. 1 is a partially schematic, partially pictorial representation of a
computer system which may be utilized to implement the method and
apparatus of the present invention;

FIG. 2 is a schematic representation of the interface circuit of FIG. 1;
and

FIGS. 3a-3c are pictorial representations of a graphically altered
cursor created in accordance with the method and apparatus of the
present invention.

DESCRIPTION OF PREFERRED EMBODIMENT

With reference now to the figures and in particular with reference to
FIG. 1, there is depicted a partially schematic, partially pictorial
representation of a computer system 10 which may be utilized to
implement the method and apparatus of the present invention. As
illustrated, computer system 10 includes a display device 12 which is
utilized to provide a display screen 14 in a manner well known in the
art. Depicted within display screen 14 is cursor 16. Cursor 16 is
depicted in an arrow shaped embodiment; however, those skilled in the
art will appreciate that cursors may be depicted in multiple different
[shapes]. Coupled to display device 12 is processor 18. Processor 18
includes the central processing unit of computer system 10 [and is
coupled to keyboard 20, which may be utilized, as is well known in the]
computer [art, to permit] the computer user to key in various command
sequences and for data to permit interaction with various computer
applications.

Those skilled in the art will, upon reference to this specification,
appreciate that computer system 10 may be implemented utilizing any
personal computer system well known in the prior art such as the PS/2
IBM Personal Computer manufactured by International Business Machines
Corporation of Armonk, N.Y. Coupled to processor 18 is a graphic
pointing device 22. In the depicted embodiment of the present invention,
graphic pointing device 22 comprises a mouse. Graphic pointing device 22
is then coupled to mouse driver 24 within processor 18 in a manner well
known in the computer
art. Of course, mouse driver 24 may be implemented in either hardware or
software.

In accordance with an important feature of the present invention, a
microphone 26 is also coupled to processor 18. Microphone 26 is
utilized, in accordance with the depicted embodiment of the present
invention, to permit vocal utterances to be utilized to control the
positioning of cursor 16 within display screen 14 in a manner which will
be depicted in greater detail herein. The analog output of microphone 26
is, in accordance with the illustrated embodiment of the present
invention, coupled to interface circuit 28. Interface circuit 28 is
preferably an interface circuit such as the one disclosed in greater
detail with respect to FIG. 2 and which is utilized to convert vocal
utterances of the type disclosed herein to control signals which are
then coupled to mouse driver 24 and utilized to linearly relocate cursor
16 within display screen 14.

Rather than utilize spoken commands such as "UP," "RIGHT," "STOP," etc.,
the method and apparatus of the present invention utilizes two linear
and independent acoustic characteristics of speech so that cursor
positioning may be done with any degree of resolution. Further, the
acoustic characteristics utilized are selected such that the sounds
required to attain a selected positioning of the cursor are easily
predictable.

One acoustic characteristic which may be utilized as a candidate for
cursor control is pitch. Most human beings can sustain a steady pitch or
make the pitch of a vocal utterance rise and fall continuously over at
least one octave. Pitch may be easily sustained regardless of vowel
utterance, i.e., despite formant or frequency shifts. The continuity and
independence of pitch makes this an excellent acoustic characteristic
for the method and apparatus of the present invention.

Amplitude may also be continuously varied by most speakers; however,
amplitude may vary as a function of pitch due to the speaker's vocal
tract resonances and radiation characteristics and due to room
acoustics. Further, high amplitude outputs will exhaust the speaker's
breath more readily and perhaps disturb other users in the same physical
location.

Therefore, the preferred embodiment of the present invention utilizes
unvoiced speech utterances, such as the fricatives or sibilant qualities
of speech. Unvoiced speech is physically created by turbulent airflow
passing through a constricted portion of the vocal tract such as that
formed by the tongue against the palate. The sibilants, such as the "S"
sound, create a "noise cone" consisting of a band of random noise
frequencies roughly centered about a "virtual" or "apparent" frequency.
This noise cone is brought about when the white noise created by the
turbulence of airflow in the vocal tract is regeneratively filtered by
the tongue-teeth cavity and the teeth-lips cavity.

Control of the cavity resonances by a speaker, by physically changing
the shape of the vocal tract, may be utilized to shift the virtual
frequency of the noise output. This control of the virtual frequency of
an unvoiced sibilant utterance is exercised in a manner such as
whistling through one's teeth.

Those skilled in the art will appreciate that there are numerous
advantages to utilizing the virtual frequency of an unvoiced utterance,
such as sibilant speech, as an axis control input to complement pitch.
In addition to the essential quality of being able to continuously vary
the virtual frequency over a wide range, this acoustic characteristic is
easily separable from pitch due to its non-repetitive nature and
spectral separation. Sibilant quality is also entirely independent of
pitch even in its creation.

Experimentation with the method and apparatus of the present invention
has proven that an experienced user may create an unvoiced vocal
utterance utilizing the "S" sound and thereby control movement of a
cursor in one axis without affecting the other axis. Similarly, a user
may hum an "M" sound, which is entirely devoid of sibilance, to control
the pitch-associated axis only. Finally, an experienced user may utilize
the "Z" sound to control both axes simultaneously.

Referring now to FIG. 2, there is depicted a schematic representation of
interface circuitry 28 of FIG. 1 which may be utilized to implement the
method and apparatus of the present invention. As is illustrated, the
output of microphone 26 is simultaneously coupled to high pass filter 30
and low pass filter 52. In the depicted embodiment of the present
invention, high pass filter 30 and low pass filter 52 both have a cutoff
frequency of approximately 500 hertz and serve to minimize cross talk
effects between voiced speech utterances and unvoiced or sibilant voiced
utterances.

First referring to high pass filter 30, the processing of an unvoiced or
sibilant speech utterance may be illustrated. The output of high pass
filter 30 is, in the depicted embodiment of the present invention,
preferably coupled to analog-to-digital converter 32. Thereafter, this
signal may be processed in the digital domain in any manner well known
in the digital computer art. The output of analog-to-digital converter
32 is then coupled to spectrum analysis circuit 34.

Spectrum analysis circuit 34 preferably performs spectrum analysis on an
unvoiced speech utterance by averaging multiple Fast Fourier Transforms.
An indication of the presence of sibilant energy is also preferably
generated by spectrum analysis circuit 34 and coupled to lock-on
hysteresis circuit 38 to denote the presence or absence of an unvoiced
speech utterance. The output of spectrum analysis circuit 34,
representing the average spectrum of an unvoiced speech utterance, is
then coupled to virtual frequency determination circuit 36. The virtual
frequency of an unvoiced speech utterance is calculated from the
averaged spectrum of an unvoiced utterance by summing the levels within
each frequency bin to derive an overall area under the spectrum.
Thereafter, the spectrum is convolved with a linear function and summed
to provide a linearly convolved area which may be utilized to form a
ratio with the simple area to provide a linear indicator of the center
or virtual frequency of the unvoiced utterance.

The output of block 36, representing the center or virtual frequency of
the unvoiced utterance, is then coupled to time weighted averaging
circuit 40. Time weighted averaging circuit 40 is utilized in
conjunction with the presence or absence signal generated by lock-on
hysteresis circuit 38 to provide smoother operation. The presence or
absence indicators for both voiced and unvoiced speech preferably
undergo hysteresis processing so that transient noises or dropouts do
not generally cause control activation or deactivation.

The output of time weighted averaging circuit 40 is then amplified and
coupled to output amplifier 50 for utilization as a control signal which
may be coupled to mouse driver 24 (see FIG. 1) to control the movement
of cursor 16 in a first axis. In the depicted embodiment of the present
invention, a second output of amplifier 50 is preferably utilized to
alter the graphic appearance of cursor 16 in a manner which will be
depicted in greater detail herein. This graphic alteration of the
appearance of cursor 16 may be utilized to readily indicate the presence
of a valid control signal in one axis.
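
The virtual frequency calculation performed by circuit 36 amounts to a ratio of two sums over the averaged spectrum. A minimal Python sketch follows. It assumes that the "linear function" is simply the centre frequency of each bin and that each spectrum is supplied as a list of non-negative magnitudes; the function names and the `bin_width_hz` parameter are hypothetical and do not come from the specification.

```python
def average_spectra(spectra):
    """Average several magnitude spectra bin by bin (equal lengths assumed)."""
    count = len(spectra)
    return [sum(bins) / count for bins in zip(*spectra)]


def virtual_frequency(spectrum, bin_width_hz):
    """Return the centre ("virtual") frequency of a magnitude spectrum in hertz."""
    area = sum(spectrum)  # simple area under the spectrum
    if area == 0:
        return None       # no energy, so no virtual frequency
    # Spectrum weighted by a linear ramp (assumed here: the bin frequency).
    weighted = sum(level * (k * bin_width_hz)
                   for k, level in enumerate(spectrum))
    return weighted / area  # ratio of weighted area to simple area
```

With two frames `[0, 1, 3, 1, 0]` and `[0, 3, 5, 3, 0]` and a bin width of 1000 Hz, the averaged spectrum is symmetric about bin 2 and the result is 2000.0 Hz. Shifting the same shape up by one bin moves the result to 3000.0 Hz, which is the linear behaviour the text relies on.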

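The hysteresis processing applied to the presence or absence indicators can be pictured as a two-way debounce: the indicator must disagree with the current lock state for a full qualification period before the state changes. The sketch below is an assumption-laden illustration, not the circuit of FIG. 2. The class name, the 10 millisecond frame period and the frame-counting approach are hypothetical; the 100 millisecond default follows the approximate qualification delay given elsewhere in this specification.

```python
class LockOnHysteresis:
    """Debounce a per-frame presence flag in both directions."""

    def __init__(self, frame_ms=10, qualify_ms=100):
        # Number of consecutive disagreeing frames needed to change state.
        self.frames_needed = max(1, qualify_ms // frame_ms)
        self.locked = False
        self._run = 0  # consecutive frames that disagree with the current state

    def update(self, present):
        """Feed one boolean presence flag and return the current lock state."""
        if present != self.locked:
            self._run += 1
            if self._run >= self.frames_needed:
                self.locked = present
                self._run = 0
        else:
            # The flag agrees with the current state, so restart the count.
            self._run = 0
        return self.locked
```

With these defaults a 50 millisecond noise burst never produces lock-on, and a 30 millisecond dropout during a sustained utterance does not release it, so only the qualified lock state would be used to enable the output stage.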
Referring back to low pass filter 52, the processing of a voiced
utterance will be illustrated. As above, the output of low pass filter
52 is coupled to an analog-to-digital converter circuit 54. Thereafter,
the digital signals generated thereby are processed in a manner familiar
to those skilled in the digital computer art. The output of
analog-to-digital converter 54 is preferably coupled to pitch extraction
circuit 56. Pitch extraction circuit 56 may be implemented utilizing any
pitch extraction circuit well known in the art; however, the depicted
embodiment of the present invention may efficiently utilize an additive
auto-correlation pitch extractor of the type well known in the art. As
above, a presence or absence signal from pitch extractor circuit 56 is
generated and coupled to lock-on hysteresis circuit 58 to be utilized in
the manner described above to prevent the transitory nature of noise
signals or dropouts from activating or deactivating a controlled signal.

Referring again to pitch extractor circuit 56, an output indicating the
pitch of a voiced utterance is then coupled to time weighted averaging
circuit 60 to be processed in the manner described above with respect to
an unvoiced utterance. As above, hysteresis processing of the presence
or absence indication is utilized to ensure that a valid control signal
is present. In the depicted embodiment of the present invention, a delay
of approximately 100 milliseconds is utilized to qualify the presence or
absence of a control signal.

Once an unvoiced control signal is deemed valid by enduring the
qualification period, the output of the lock-on hysteresis circuit 38
becomes active whereupon a leading edge detector 42 recognizes the
acquisition of lock-on condition and causes the time-weighted average
circuit 40 to initialize to the value present at its input from block
36. The output of the leading edge detector 42 also activates a signal
gate 44 to store the initial value output from averaging circuit 40 in a
memory 46. Thereafter, all subsequent readings are compared by
differential comparator 48 to the reference values stored in memory 46
to calculate a relative increase or decrease which has occurred since
the beginning of an utterance. Thus, if the user initiates control and
then increases sibilant virtual frequency, the velocity of the cursor
will increase in a positive direction. Additionally, the output of
lock-on hysteresis circuit 38 acts to enable output amplifier 50 so that
signals are passed to the mouse driver only during a lock-on condition.

The same interaction described above applies to the initial lock-on of
the voiced aspects of an utterance by the functions of the leading edge
detector 62, signal gate 64, reference memory 66, differential
comparator 68 and amplifier 70. Likewise, if the user initiates control
and then increases pitch, the velocity of the cursor will increase in a
positive direction along a determined axis.

In the depicted embodiment of the present invention, the motion of the
cursor will persist as long as the elevated pitch is sustained. The
motion of the cursor may be stopped by either returning to the initial
pitch utilized to initiate the control utterance or by stopping the
utterance altogether. Of course, those skilled in the art will
appreciate that by lowering the pitch or virtual frequency of the
control utterance to a value below this initial value the cursor may be
caused to move in the opposite direction along a respective axis. As
above, the output of amplifier 70 is preferably coupled to mouse driver
24 (see FIG. 1) to control the motion of cursor 16 in a second axis.
Also, a second output of amplifier 70 may be utilized to graphically
alter the appearance of cursor 16 in a manner which will be explained in
greater detail herein.

With reference now to FIGS. 3a-3c, there are depicted pictorial
representations of a graphically altered cursor 16 which have been
created in accordance with the method and apparatus of the present
invention. As is illustrated, in FIGS. 3a-3c a cursor 16 is depicted
within display screen 14 of computer system 10 (see FIG. 1). Within FIG.
3a cursor 16 includes a darkened portion 72 in the left half of the
arrowhead forming cursor 16. The presence of a darkened portion 72 in
the left half of arrowhead 16 indicates, in the depicted embodiment of
the present invention, the presence of a valid control signal in the
horizontal axis. Thus, a user of computer system 10 attempting to
verbally relocate cursor 16 may visually ascertain that a valid control
signal has been verbally generated by referring to the display of cursor
16.

Of course, a second darkened area 74 may be utilized, as illustrated in
FIG. 3b, to indicate the presence of a valid control signal in the
vertical axis. In this manner, the user will be assured that he or she
is accurately generating a proper verbal command to relocate cursor 16
in the vertical direction.

Finally, as illustrated in FIG. 3c, darkened portions 72 and 74 may be
graphically depicted within cursor 16 in those cases in which an
operator may simultaneously generate both a voiced and unvoiced control
utterance causing cursor 16 to move in an oblique manner as illustrated
in FIG. 3c.

Upon reference to the foregoing, those skilled in the art will
appreciate that the applicant has provided a novel method whereby a
cursor may be linearly controlled over fine movements utilizing vocal
utterances. By utilizing selected linearly variable acoustic
characteristics of these vocal utterances to linearly relocate cursor 16
within display screen 14 and by carefully selecting these acoustic
characteristics such that a user may easily vary the characteristic
linearly over a selected range, the method and apparatus of the present
invention provides a technique whereby the location of cursor 16 within
display screen 14 may be rapidly and accurately controlled.

While the invention has been particularly shown and described with
reference to a preferred embodiment, it will be understood by those
skilled in the art that various changes in form and detail may be made
therein without departing from the spirit and scope of the invention.

I claim:

1. A method in a computer system including a display screen having a
cursor displayed at a selected position therein, for linear vocal
control of fine cursor movement, said method comprising the steps of:
detecting a vocal utterance;
determining at least one linearly variable acoustic characteristic of
said vocal utterance; and
linearly varying the position of said cursor in response to variations
in said at least one acoustic characteristic.

2. The method in a computer system including a display screen having a
cursor displayed at a selected position therein, for linear vocal
control of fine cursor movement according to claim 1, wherein said step
of determining at least one linearly variable acoustic characteristic
comprises the step of determining the pitch of said vocal utterance.

3. The method in a computer system including a display screen having a
cursor displayed at a selected position therein, for linear vocal
control of fine cursor movement according to claim 1, further including
the step of providing a graphic indication within said display screen of
the detection of said vocal utterance.

4. The method in a computer system including a display screen having a
cursor displayed at a selected position therein, for linear vocal
control of fine cursor movement according to claim 3, wherein said step
of providing a graphic indication within said display screen of the
detection of said vocal utterance comprises the step of graphically
altering the appearance of said cursor.

5. The method in a computer system including a display screen having a
cursor displayed at a selected position therein, for linear vocal
control of fine cursor movement according to claim 1, wherein said step
of determining at least one linearly variable acoustic characteristic of
said vocal utterance comprises the step of determining a first linearly
variable acoustic characteristic and a second linearly variable acoustic
characteristic of said vocal utterance and wherein the position of said
cursor is varied linearly in a first axis in response to variations in
said first acoustic characteristic and varied linearly in a second axis
in response to variations in said second acoustic characteristic.

6. The method in a computer system including a display screen having a
cursor displayed at a selected position therein, for linear vocal
control of fine cursor movement according to claim 5, wherein said step
of determining said second linearly variable acoustic characteristic
comprises the step of determining the virtual frequency of an unvoiced
vocal utterance.

7. An apparatus for linear vocal control of cursor position in a
computer display system having a cursor displayed at a selected position
therein, said apparatus comprising:
acoustic input means for detecting a vocal utterance;
utterance analysis means coupled to said acoustic input means for
determining at least one linearly variable acoustic characteristic of
said vocal utterance; and
cursor position means coupled to said utterance analysis means for
linearly varying the position of said cursor in response to variations
in said at least one acoustic characteristic.

8. The apparatus for linear vocal control of cursor position in a
computer display system having a cursor displayed therein according to
claim 7, wherein said at least one linearly variable acoustic
characteristic comprises pitch.

9. The apparatus for linear vocal control of cursor position in a
computer display system having a cursor displayed therein according to
claim 7, wherein said acoustic input means comprises a microphone.

10. The apparatus for linear vocal control of cursor position in a
computer display system having a cursor displayed therein according to
claim 7, wherein said utterance analysis means comprises means for
detecting voiced vocal utterances and unvoiced vocal utterances.

11. The apparatus for linear vocal control of cursor position in a
computer display system having a cursor displayed therein according to
claim 10, wherein said utterance analysis means includes means for
determining the pitch of a voiced vocal utterance and wherein said
cursor position means linearly varies the position of said cursor in a
first axis in response to variations in said pitch.

12. The apparatus for linear vocal control of cursor position in a
computer display system having a cursor displayed therein according to
claim 11, wherein said utterance analysis means includes means for
determining the virtual frequency of an unvoiced vocal utterance and
wherein said cursor position means linearly varies the position of said
cursor in a second axis in response to variations in said virtual
frequency.

* * * * *